Fix/export workspace path override by clean6378-max-it · Pull Request #114 · cppalliance/cppa-cursor-browser

clean6378-max-it · 2026-06-23T01:41:22Z

Closes #107

Summary

Replaces the CLI --base-dir workspace-path bypass that mutated os.environ["WORKSPACE_PATH"] with an explicit resolve_workspace_path(override=...) parameter. Stacks on Monday PR 1 (refactor/export-engine-consolidation); the core override plumbing landed there during review — this PR adds dedicated regression tests and a doc comment fix.

Sprint item Chen § #4 (3 pt, Medium).

Changes

utils/workspace_path.py — resolve_workspace_path(*, override: str | None = None); call-time override takes precedence over module override, env, and default; does not mutate WORKSPACE_PATH
scripts/export.py — resolve_workspace_path(override=opts.get("base_dir")); os.environ["WORKSPACE_PATH"] mutation removed
tests/test_export_base_dir_override.py (new) — asserts --base-dir is passed as override= and leaves WORKSPACE_PATH unchanged
tests/test_workspace_path_thread_safety.py — explicit-override precedence test (from PR 1 stack)

Out of scope

Export-engine consolidation (Monday PR 1 — Fix multi-directory workspace resolution and chat-to-project assignment #3)
Web server WORKSPACE_PATH lock semantics

Acceptance criteria

scripts/export.py passes --base-dir explicitly to resolve_workspace_path() instead of mutating os.environ
os.environ["WORKSPACE_PATH"] mutation line removed
resolve_workspace_path() accepts optional override parameter
Existing export / workspace_path / thread-safety tests pass
Thread-safety tests unaffected
PR approved by at least 1 reviewer

Test plan

mypy --strict utils/workspace_path.py scripts/export.py
pytest tests/test_export_base_dir_override.py tests/test_workspace_path_thread_safety.py tests/test_cli_export_e2e.py -q
pytest -k "workspace_path or export or thread" -q
python scripts/export.py --base-dir <workspaceStorage> --help

Merge note

Base branch: refactor/export-engine-consolidation (PR 1 must merge first, or merge both in order).

Summary by CodeRabbit

Bug Fixes
- Improved export robustness by skipping invalid composer rows and treating corrupt stored values more safely.
- Fixed CLI session exports to derive session creation timestamps from provided metadata when available.
Improvements
- Improved workspace project listing performance by adding cache-hit behavior when caching is enabled.
- Enhanced --base-dir handling to correctly override workspace path resolution without permanently changing WORKSPACE_PATH.
Tests
- Added regression coverage for --base-dir override behavior and environment variable preservation.

coderabbitai · 2026-06-23T01:41:35Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9d5e46bb-da4d-41b1-b3b4-770e8d59a041

📥 Commits

Reviewing files that changed from the base of the PR and between dbec81e and 0587356.

📒 Files selected for processing (1)

services/export_engine.py

🚧 Files skipped from review as they are similar to previous changes (1)

services/export_engine.py

📝 Walkthrough

Walkthrough

Adds a keyword-only override parameter to resolve_workspace_path to eliminate os.environ mutation in the CLI export path. Fixes TypeError handling during IDE composer JSON parsing and corrects CLI session createdAt timestamp conversion via to_epoch_ms. Refactors list_workspace_projects to use fingerprint-based cache lookups with pre-collected entries. Adds regression tests for the override parameter behavior.

Changes

Workspace Path Override, Export Engine Fixes, and Workspace Listing Caching

Layer / File(s)	Summary
`resolve_workspace_path` override parameter and regression tests `utils/workspace_path.py`, `tests/test_export_base_dir_override.py`	Adds `override: str \| None = None` to `resolve_workspace_path` as a call-time override with highest resolution precedence. Updates module comments to document `POST /api/set-workspace` vs. CLI `--base-dir` distinction. Regression tests verify `override` is forwarded and `WORKSPACE_PATH` env var is not mutated.
Export engine robustness: composer parsing and session timestamps `services/export_engine.py`, `api/export_api.py`	IDE composer export validates `key` contains `:` delimiter and skips rows with `TypeError` in JSON parsing. CLI session export converts `meta["createdAt"]` through `to_epoch_ms` instead of raw epoch-millis. `api/export_api.py` receives non-functional formatting normalization.
Workspace listing fingerprint-based caching and orchestration refactor `services/workspace_listing.py`	Adds `fingerprint_workspace_storage` and `global_storage_db_path` imports. Refactors `list_workspace_projects` to pre-collect entries, compute fingerprint, and return early on cache hit. Rewrites `_build_workspace_projects_uncached` to accept `WorkspaceOrchestration` and derive `ctx`/`workspace_entries` from it.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

cppa-cursor-browser: Export script — fix os.environ workspace path bypass #107: This PR directly implements the override parameter for resolve_workspace_path and adds regression tests that satisfy the stated acceptance criteria for eliminating the os.environ mutation in the CLI export path.

Possibly related PRs

cppalliance/cppa-cursor-browser#54: Both PRs modify utils/workspace_path.py around resolve_workspace_path reading _workspace_path_override under a lock; the earlier PR introduced the lock semantics and this PR adds the call-time override parameter above it.
cppalliance/cppa-cursor-browser#112: Both PRs modify services/export_engine.py and services/workspace_listing.py; this PR applies incremental robustness fixes and orchestration refactoring to the shared export engine and listing logic extracted in #112.
cppalliance/cppa-cursor-browser#61: Both PRs modify services/workspace_listing.py around list_workspace_projects and its orchestration flow; this PR refactors the caching and orchestration integration while the earlier PR focused on loader helpers.

Suggested reviewers

bradjin8
wpak-ai
timon0305

Poem

🐇 Hoppity-hop, no more os.environ to poke,
The override param means no global env smoke.
TypeError caught, and createdAt now aligned,
The fingerprint cache leaves stale results behind.
This bunny approves — clean paths, no sleight of hand! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main change: fixing the export script's workspace path override mechanism by replacing environment mutation with explicit parameter passing.
Linked Issues check	✅ Passed	All acceptance criteria from issue `#107` are met: resolve_workspace_path() accepts override parameter, scripts/export.py passes --base-dir explicitly, environment mutation is eliminated, and new regression tests verify the behavior.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with issue `#107` scope: workspace path override parameter, export script updates, and regression tests. No unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/export-workspace-path-override

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

services/workspace_listing.py (1)
93-100: 🚀 Performance & Scalability | 🔵 Trivial | 🏗️ Heavy lift

Keep project-cache hits from doing full orchestration work.

prepare_workspace_orchestration() resolves workspace context and display maps before get_cached_projects(), but cache hits only need the fingerprint. Consider splitting fingerprint preparation from full orchestration or making the context/display-map fields lazy so cached workspace listings return cheaply.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/workspace_listing.py` around lines 93 - 100, The issue is that
prepare_workspace_orchestration() performs expensive work resolving workspace
context and display maps before checking get_cached_projects() for a cache hit.
Restructure the logic to compute only the fingerprint first (before full
orchestration), check get_cached_projects() with that fingerprint, and only
invoke the complete prepare_workspace_orchestration() if the cache check misses,
avoiding unnecessary work when returning cached results.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/export.py`:
- Around line 171-180: In the _read_last_export_ms function, after loading the
JSON with json.load(f) into the st variable, add a type validation check to
ensure st is a dictionary before calling st.get("lastExportTime"). If st is not
a dictionary (for example, if the JSON file contains an array, string, or
number), return 0 to treat it as unreadable state and fall back to full export.
You can either add an isinstance(st, dict) check before the st.get() call or add
AttributeError to the exception handler tuple to catch this case.

In `@services/export_engine.py`:
- Around line 371-382: The createdAt value from meta.get("createdAt") is used
directly without normalization in the assignment to created_ms, but this value
may be in seconds, an ISO string, or milliseconds depending on the source. This
causes issues in the max() comparison with db_mtime_ms (which is in
milliseconds) and elsewhere in the code like messages_to_bubbles() and filename
timestamps. Normalize the createdAt value to milliseconds before assigning it to
created_ms by adding logic that detects and converts seconds timestamps, ISO
strings, or other formats to milliseconds before the fallback to
datetime.now().timestamp() * 1000.
- Around line 217-225: The exception handler for json.loads(row["value"]) in the
try-except block is missing TypeError, which can be raised when row["value"] is
a non-string type like None or int. Update the except clause that currently
catches (json.JSONDecodeError, ValueError) to also include TypeError in the
tuple of exceptions, ensuring that malformed non-string values are logged and
skipped consistently with other similar error handling patterns in the codebase.

---

Nitpick comments:
In `@services/workspace_listing.py`:
- Around line 93-100: The issue is that prepare_workspace_orchestration()
performs expensive work resolving workspace context and display maps before
checking get_cached_projects() for a cache hit. Restructure the logic to compute
only the fingerprint first (before full orchestration), check
get_cached_projects() with that fingerprint, and only invoke the complete
prepare_workspace_orchestration() if the cache check misses, avoiding
unnecessary work when returning cached results.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f08194c8-1cc4-4c57-919a-1b3fabf9336f

📥 Commits

Reviewing files that changed from the base of the PR and between ee6a6e7 and 196f2b6.

📒 Files selected for processing (10)

api/export_api.py
pyproject.toml
scripts/export.py
services/export_engine.py
services/workspace_listing.py
tests/test_api_export.py
tests/test_export_base_dir_override.py
tests/test_export_engine.py
tests/test_workspace_path_thread_safety.py
utils/workspace_path.py

💤 Files with no reviewable changes (1)

pyproject.toml

coderabbitai

🧹 Nitpick comments (1)

services/workspace_listing.py (1)
96-118: 🚀 Performance & Scalability | 🔵 Trivial

Duplicated fingerprint computation introduces a latent cache-key divergence risk.

The inline block (lines 100–108) recomputes global_storage_db_path, get_cli_chats_path, and fingerprint_workspace_storage identically to how prepare_workspace_orchestration does so internally (export_engine.py lines 129–135). On a cache miss this duplicates I/O, and more critically, the read key (fingerprint, line 109) and write key (orch.fingerprint, line 124) are produced by two separate copies of the same logic. They are byte-identical today, but if either site's conditional gating (e.g., os.path.isfile/os.path.isdir) or the rules argument ever change, cache reads and writes would key on different fingerprints, silently breaking the cache.

Consider threading the precomputed fingerprint into prepare_workspace_orchestration to maintain a single source of truth while preserving the cheap cache-hit short-circuit before the expensive ctx/display-map resolution.
♻️ Sketch: pass the computed fingerprint through
     orch = prepare_workspace_orchestration(
         workspace_path,
         rules,
         nocache=effective_nocache,
         workspace_entries=workspace_entries,
+        fingerprint=fingerprint if not effective_nocache else None,
     )
This requires adding a keyword-only fingerprint: dict[str, Any] | None = None parameter to prepare_workspace_orchestration that, when provided, is used instead of recomputing. That keeps a single fingerprint definition and removes the duplicate stat lookups on the cache-miss path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/workspace_listing.py` around lines 96 - 118, The fingerprint
computation in the inline block (lines 100-108) duplicates the identical logic
that prepare_workspace_orchestration performs internally, creating a cache-key
divergence risk if either implementation changes. Add a keyword-only fingerprint
parameter to prepare_workspace_orchestration with a default value of None, then
modify the function to use the provided fingerprint when available instead of
recomputing it. In the calling code within workspace_listing.py, pass the
computed fingerprint value to prepare_workspace_orchestration to eliminate the
duplication while preserving the cache-hit short-circuit before expensive
orchestration operations.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@services/workspace_listing.py`:
- Around line 96-118: The fingerprint computation in the inline block (lines
100-108) duplicates the identical logic that prepare_workspace_orchestration
performs internally, creating a cache-key divergence risk if either
implementation changes. Add a keyword-only fingerprint parameter to
prepare_workspace_orchestration with a default value of None, then modify the
function to use the provided fingerprint when available instead of recomputing
it. In the calling code within workspace_listing.py, pass the computed
fingerprint value to prepare_workspace_orchestration to eliminate the
duplication while preserving the cache-hit short-circuit before expensive
orchestration operations.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa81b06d-e9a9-448b-abd6-114fc4760a91

📥 Commits

Reviewing files that changed from the base of the PR and between 196f2b6 and d1d8101.

📒 Files selected for processing (3)

scripts/export.py
services/export_engine.py
services/workspace_listing.py

🚧 Files skipped from review as they are similar to previous changes (2)

services/export_engine.py
scripts/export.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

services/export_engine.py (1)

386-392: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Guard session["meta"] shape before .get() calls.

At Lines 386-392, if meta is present but not a dict (or None), this raises AttributeError before the session-level read/parse guard, causing the whole export to fail on one bad session.

Suggested fix

-            meta = session.get("meta", {})
+            raw_meta = session.get("meta")
+            meta = raw_meta if isinstance(raw_meta, dict) else {}
             session_id = session["session_id"]
             created_raw = meta.get("createdAt")
             created_ms = to_epoch_ms(created_raw) if created_raw else int(
                 datetime.now().timestamp() * 1000,
             )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/export_engine.py` around lines 386 - 392, The code retrieves `meta`
from the session dictionary but does not validate that it is actually a
dictionary before calling `.get()` on it. If the session contains a non-dict
value for the meta key (such as None or a string), calling
`meta.get("createdAt")` or `meta.get("name")` will raise an AttributeError. Add
a type guard to check that `meta` is a dictionary before using its `.get()`
methods, defaulting it to an empty dictionary if it is not the correct type, so
that subsequent `.get()` calls operate safely without raising AttributeError.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@services/export_engine.py`:
- Line 233: The composer_id extraction on row["key"].split(":")[1] will raise an
IndexError if the key doesn't contain a colon, causing the entire export to
abort instead of gracefully handling malformed rows. Add validation before
splitting to check if the key contains the colon separator, and implement proper
error handling to skip malformed rows or provide a fallback value so the export
can continue processing remaining rows without interruption.

---

Duplicate comments:
In `@services/export_engine.py`:
- Around line 386-392: The code retrieves `meta` from the session dictionary but
does not validate that it is actually a dictionary before calling `.get()` on
it. If the session contains a non-dict value for the meta key (such as None or a
string), calling `meta.get("createdAt")` or `meta.get("name")` will raise an
AttributeError. Add a type guard to check that `meta` is a dictionary before
using its `.get()` methods, defaulting it to an empty dictionary if it is not
the correct type, so that subsequent `.get()` calls operate safely without
raising AttributeError.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f1e18b2d-d529-43cd-873f-05d7721079cc

📥 Commits

Reviewing files that changed from the base of the PR and between 14c256f and dbec81e.

📒 Files selected for processing (5)

api/export_api.py
services/export_engine.py
services/workspace_listing.py
tests/test_export_base_dir_override.py
utils/workspace_path.py

🚧 Files skipped from review as they are similar to previous changes (3)

tests/test_export_base_dir_override.py
utils/workspace_path.py
services/workspace_listing.py

… meta

coderabbitai Bot reviewed Jun 23, 2026

View reviewed changes

Comment thread scripts/export.py Outdated

Comment thread services/export_engine.py

Comment thread services/export_engine.py Outdated

coderabbitai Bot reviewed Jun 23, 2026

View reviewed changes

timon0305 requested changes Jun 23, 2026

View reviewed changes

Comment thread api/export_api.py

Comment thread tests/test_export_base_dir_override.py

Comment thread scripts/export.py

Comment thread services/export_engine.py Outdated

Comment thread services/export_engine.py Outdated

clean6378-max-it mentioned this pull request Jun 23, 2026

Extract shared export engine; wire CLI to summary cache and drop mypy… #112

Merged

11 tasks

clean6378-max-it requested a review from timon0305 June 23, 2026 18:38

clean6378-max-it added 4 commits June 24, 2026 03:24

Address export consolidation review findings

0da9a4d

Add regression tests for --base-dir workspace path override

96d196c

Harden export state parsing and listing cache fast path

ee9e90f

Flatten nested mocks in base_dir override tests

dbec81e

clean6378-max-it force-pushed the fix/export-workspace-path-override branch from 14c256f to dbec81e Compare June 23, 2026 19:26

coderabbitai Bot reviewed Jun 23, 2026

View reviewed changes

Comment thread services/export_engine.py Outdated

timon0305 approved these changes Jun 23, 2026

View reviewed changes

timon0305 requested a review from wpak-ai June 23, 2026 19:31

Harden export_engine: skip malformed composer keys, guard CLI session…

0587356

… meta

wpak-ai approved these changes Jun 23, 2026

View reviewed changes

wpak-ai merged commit ae468e9 into master Jun 23, 2026
16 checks passed

wpak-ai deleted the fix/export-workspace-path-override branch June 23, 2026 20:29

Conversation

clean6378-max-it commented Jun 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Out of scope

Acceptance criteria

Test plan

Merge note

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

clean6378-max-it commented Jun 23, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading